Hastings recently reported a randomized construction of channels violating the minimum output entropy additivity conjecture. Here we revisit his argument, presenting a simplified proof. In particular, we do not resort to the exact probability distribution of the Schmidt coefficients of a random bipartite pure state, as in the original proof, but rather derive the necessary large deviation bounds by a concentration of measure argument. Furthermore, we prove non-additivity for the overwhelming majority of channels consisting of a Haar random isometry followed by partial trace over the environment, for an environment dimension much bigger than the output dimension. This makes Hastings' original reasoning clearer and extends the class of channels for which additivity can be shown to be violated.

I. INTRODUCTION

The oldest problem in quantum information theory is probably the determination of the capacity of a quantum-mechanical channel for classical information transmission. Given a quantum channel from a sender to a receiver, characterized by a trace preserving completely positive map $\mathcal{E}$, its classical capacity is defined as the maximum number of bits which can be reliably sent per use of the channel, in the limit of infinitely many realizations of the channel. Holevo and Schumacher-Westmoreland proved the following formula for the classical information transmission capacity:

$$C(\mathcal{E}) = \chi^{\infty}(\mathcal{E}) := \lim_{n \to \infty} \frac{\chi(\mathcal{E}^{\otimes n})}{n}, \qquad (1)$$

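where the Holevo $\chi$-quantity is defined by

$$\chi(\mathcal{E}) := \max_{\{p_i, \rho_i\}} \left[ S\Big(\mathcal{E}\Big(\sum_i p_i \rho_i\Big)\Big) - \sum_i p_i S(\mathcal{E}(\rho_i)) \right], \qquad (2)$$

with $S$ being the von Neumann entropy and the maximization ranging over all ensembles $\{p_i, \rho_i\}$.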
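
As an illustration of Eq. (2), the following minimal sketch evaluates the bracketed expression for one fixed ensemble; it does not perform the maximization, so it only gives a lower bound on $\chi$. The helper names (`von_neumann_entropy`, `holevo_value`, `depolarizing`) and the choice of a qubit depolarizing channel are assumptions made for the example and do not appear in the paper.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Von Neumann entropy S(rho) in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

def holevo_value(channel, probs, states):
    """Bracket of Eq. (2) for a single ensemble {p_i, rho_i} (no maximization)."""
    average = sum(p * s for p, s in zip(probs, states))
    mean_output_entropy = sum(
        p * von_neumann_entropy(channel(s)) for p, s in zip(probs, states)
    )
    return von_neumann_entropy(channel(average)) - mean_output_entropy

def depolarizing(lam, dim=2):
    """Hypothetical example channel: rho -> (1 - lam) rho + lam tr(rho) I/dim."""
    return lambda rho: (1 - lam) * rho + lam * np.trace(rho) * np.eye(dim) / dim

proj0 = np.array([[1, 0], [0, 0]], dtype=complex)
proj1 = np.array([[0, 0], [0, 1]], dtype=complex)
chi_ideal = holevo_value(depolarizing(0.0), [0.5, 0.5], [proj0, proj1])
chi_noisy = holevo_value(depolarizing(0.5), [0.5, 0.5], [proj0, proj1])
```

For the noiseless channel the two orthogonal signal states carry one full bit, while the noisy channel yields a strictly smaller value for the same ensemble.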

An important question concerning the capacity formula given by Eq. (1) is whether the regularization of the $\chi$-quantity to infinitely many uses of the channel is really needed on the right-hand side of Eq. (1). Indeed, such a necessity would render the evaluation of the formula given by Eq. (1) in general intractable; moreover, it would show that we do not fully understand the structure of the optimal coding strategy, since from Eq. (1) we cannot say anything about the (in general entangled) states $\rho_i$ appearing in the optimal ensemble. On a more positive note, the need for regularization would also show that we can boost the information transmission capacity by using entangled encoding states.

Based on numerical evidence in low dimensions and several results for particular classes of channels, it was conjectured that the $\chi$-quantity is additive, i.e. for every pair of channels $\mathcal{E}_1, \mathcal{E}_2$,

$$\chi(\mathcal{E}_1 \otimes \mathcal{E}_2) = \chi(\mathcal{E}_1) + \chi(\mathcal{E}_2). \qquad (3)$$

The validity of this conjecture would imply that the classical capacity of a quantum channel is given simply by its Holevo $\chi$-quantity, which would constitute a single-letter formula for the capacity. It turns out that Eq. (3) is in fact equivalent to the non-necessity of the limit in Eq. (1) [16]: $C(\mathcal{E}) = \chi(\mathcal{E})$ for every channel $\mathcal{E}$ if, and only if, Eq. (3) holds true for every pair of channels $\mathcal{E}_1, \mathcal{E}_2$ (see also [17]).

The additivity of the $\chi$-quantity can be related to other additivity questions. The first concerns the entanglement cost of a bipartite quantum state $\rho$ shared by Alice and Bob. It is defined as the optimal rate of EPR pairs needed for the formation of $\rho$, in the limit of asymptotically many copies of the state, by local operations and classical communication between Alice and Bob. It was shown that the entanglement cost is given by

$$E_c(\rho) := \lim_{n \to \infty} \frac{E_F(\rho^{\otimes n})}{n}, \qquad (4)$$

where the entanglement of formation is defined as

$$E_F(\rho) := \min_{\{p_i, \psi_i\}} \sum_i p_i S\big(\mathrm{tr}_A(|\psi_i\rangle\langle\psi_i|)\big), \qquad (5)$$

with the minimization taken over all pure state ensembles of $\rho$. As shown by Shor in Ref. [20] (building on [21, 22, 23]), the additivity of the entanglement of formation is equivalent to the additivity of $\chi$ as given by Eq. (3).

The second additivity question concerns the distillable common randomness of a bipartite state, given by the optimal rate of maximally correlated classical bits that can be extracted from a bipartite quantum state, when classical communication is allowed from Alice to Bob (the rate of bits communicated being subtracted from the rate of maximally correlated bits obtained at the end of the protocol). Devetak and Winter proved [24] that this rate is given by

$$\lim_{n \to \infty} \frac{I^{\leftarrow}(\rho^{\otimes n})}{n}, \qquad (6)$$

with

$$I^{\leftarrow}(\rho) := \max_{\{M_i\}} \left[ S(\rho_B) - \sum_i p_i S(\rho_i) \right], \qquad (7)$$

where the maximization runs over POVMs $\{M_i\}$ applied to Alice's system, $p_i := \mathrm{tr}(\rho (M_i \otimes I))$ and $\rho_i := \mathrm{tr}_A(\rho (M_i \otimes I))/p_i$. Koashi and Winter derived a beautiful relation between the entanglement of formation and the quantity given in Eq. (7), showing in particular the equivalence of the need of the limit in Eq. (6) to the validity of Eq. (3) for every pair of channels.
An important simplification of the additivity problem, due to Shor [20], shows that the additivity of the $\chi$-quantity is equivalent to a simpler question: the additivity of the minimum output entropy, defined as [20]

$$S_{\min}(\mathcal{E}) := \min_{\rho} S(\mathcal{E}(\rho)). \qquad (8)$$

It turns out that Eq. (3) holds true if, and only if, for every pair of channels $\mathcal{E}_1, \mathcal{E}_2$,

$$S_{\min}(\mathcal{E}_1 \otimes \mathcal{E}_2) = S_{\min}(\mathcal{E}_1) + S_{\min}(\mathcal{E}_2). \qquad (9)$$

Recently, based on similar results on Renyi entropies by Winter [29] and Hayden [28] (see also [30, 31, 32]), Hastings proved the breakthrough result that the minimum output entropy is not additive [27]: in general, Eq. (9) does not hold true. This in turn implies that the limits in Eqs. (1), (4), and (6) are needed and thus that we are unfortunately further away from grasping these three capacities than we might have expected.



Hastings' argument combines the approach of Winter [29] and Hayden [28] to the problem with powerful new ideas and techniques to construct randomized examples of channels violating Eq. (9). In particular, his argument is heavily based on an exact expression for the eigenvalue probability distribution of the reduced density matrix of a Haar distributed bipartite state [33]. The main goal of the present paper is to revisit Hastings' proof by employing instead more general properties of the Haar distribution, such as large deviation bounds for the concentration of well-behaved functions around their mean values in high dimensions. This allows us to present the proof in a relatively concise form. Moreover, we will be able to slightly strengthen Hastings' result and prove non-additivity of the overwhelming majority of Haar random channels (for appropriate input, output, and environment dimensions). As a by-product, we also obtain a new result concerning the concentration of measure phenomenon in high dimensional quantum states, which may be of independent interest.

We would like to refer the reader to an earlier paper by Fukuda, King, and Moser in a similar spirit [34], where Hastings' original argument is explained in great detail and rigor. In particular, the authors derived explicit lower bounds on the input, output and environment dimensions for which channels violating additivity can be constructed. Our approach is unlikely to provide better estimates than the ones found in Ref. [34], as it does not rely on the exact probability distribution of the Schmidt coefficients of a Haar bipartite state. However, as our proof differs from the original in a few places, the optimization of the dimensions in our version of the proof may still be an interesting task (which we do not pursue here).

Notation: We denote the set of density matrices acting on a Hilbert space $\mathcal{H}$ by $\mathcal{D}(\mathcal{H})$. Moreover, we will often write $A$ and $B$ for finite dimensional Hilbert spaces, $A \otimes B$ or $AB$ for their tensor product, and $|A|$, $|B|$ for their dimensions. For a pure state $|\psi^{AB}\rangle \in AB$, we define $\psi^{AB} := |\psi^{AB}\rangle\langle\psi^{AB}|$, while $\psi^{A}$ will denote $\mathrm{tr}_B(\psi^{AB})$, where $\mathrm{tr}_B$ is the partial trace over subsystem $B$. We denote the $d$-dimensional unitary group by $\mathbb{U}(d)$. We define the entropy deviation from its maximal value of a state $\rho \in \mathcal{D}(\mathbb{C}^d)$ by $\delta S(\rho) := \log(d) - S(\rho)$. Let $\mathbb{S}^n := \{x \in \mathbb{R}^{n+1} : \|x\|_2 = 1\}$ denote the Euclidean sphere in $\mathbb{R}^{n+1}$ and $\mu$ denote the normalized rotationally invariant measure in $\mathbb{S}^n$ (the Haar measure). Finally, the Bachmann-Landau notation $g(n) = o(f(n))$ stands for $\forall k > 0, \exists n_0 : \forall n > n_0, g(n) \leq k f(n)$.

Structure of the paper: In section II we present the main results of the paper as well as the key definitions used in the proofs. The counterexamples to the additivity conjecture are given by the combination of three propositions, II.6, II.7 and II.8, which are proven in sections III, IV and V respectively.
II. DEFINITIONS AND MAIN RESULTS

We will consider channels from $A$ to $B$ of the form

$$\mathcal{E}(\rho) = \mathrm{tr}_A\big(U (\rho^A \otimes |0\rangle\langle 0|^B) U^{\dagger}\big) \qquad (10)$$

for a unitary $U \in \mathbb{U}(|A||B|)$. The channels thus have input and environment dimensions equal to $|A|$ and output dimension equal to $|B|$. Moreover, we will make use of the conjugate channel of $\mathcal{E}$, defined as

$$\bar{\mathcal{E}}(\rho) = \mathrm{tr}_A\big(U^{*} (\rho^A \otimes |0\rangle\langle 0|^B) U^{T}\big). \qquad (11)$$

The counterexamples to the minimum output entropy additivity conjecture will be constructed by selecting the unitary $U$ at random from the Haar measure in $\mathbb{U}(|A||B|)$ and considering the regime of a very large environment dimension $|A| \gg |B|$.
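
The construction of Eq. (10) can be sketched numerically as follows. This is only an illustration under stated assumptions: the Haar unitary is sampled with the standard QR-based recipe (a swapped-in technique, not something the paper prescribes), the composite index is ordered as (a, b) -> a*|B| + b, and the names `haar_unitary` and `random_channel` are hypothetical.

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-distributed unitary via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((dim, dim))
         + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases                     # fix the phase ambiguity of QR

def random_channel(dim_a, dim_b, rng):
    """Channel of Eq. (10): rho -> tr_A( U (rho ⊗ |0><0|_B) U^dagger )."""
    u = haar_unitary(dim_a * dim_b, rng)
    # Isometry V|a> = U(|a> ⊗ |0>_B), stored with indices (a_out, b_out, a_in).
    v = u.reshape(dim_a, dim_b, dim_a, dim_b)[:, :, :, 0]

    def channel(rho):
        # (V rho V^dagger) followed by the partial trace over A.
        return np.einsum('abi,ij,acj->bc', v, rho, v.conj())

    return channel

rng = np.random.default_rng(0)
dim_a, dim_b = 8, 3
channel = random_channel(dim_a, dim_b, rng)
ket = rng.standard_normal(dim_a) + 1j * rng.standard_normal(dim_a)
ket /= np.linalg.norm(ket)
output = channel(np.outer(ket, ket.conj()))
```

The output is a |B| x |B| density matrix; the environment A, of dimension |A|, is the system traced out.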

Throughout the paper $c_0 > 0$ will denote a fixed constant which can be taken to be e.g. $c_0 = 1333$, while the Landau notation $o(1)$ will stand for a term which can be taken as small as desired by choosing $|A|$ large enough. Hastings' theorem can be stated as follows.

Theorem I For $U$ drawn from the Haar measure in $\mathbb{U}(|A||B|)$, consider a channel as in Eq. (10). Then, for $c > c_0$, with probability $1 - o(1)$,

$$S_{\min}(\mathcal{E} \otimes \bar{\mathcal{E}}) \leq S_{\min}(\mathcal{E}) + S_{\min}(\bar{\mathcal{E}}) - \frac{\log|B| - 2c}{|B|}. \qquad (12)$$

We will prove Theorem I by the combination of two results. The first, analogous to a similar result of Winter and Hayden [28, 29, 30] on Renyi entropies, delivers an upper bound on the minimum output entropy of $\mathcal{E} \otimes \bar{\mathcal{E}}$ by considering the output entropy of the canonical maximally entangled state as an input.

Lemma II.1 For a channel given by Eq. (10),

$$S_{\min}(\mathcal{E} \otimes \bar{\mathcal{E}}) \leq 2\log|B| - \frac{\log|B|}{|B|}. \qquad (13)$$

For completeness, we reproduce the proof of Lemma II.1 in Appendix D.

The second result is a probabilistic argument for the existence of channels with high minimum output entropy. This is Hastings' breakthrough contribution to the problem [27].

Lemma II.2 For $U$ drawn from the Haar measure in $\mathbb{U}(|A||B|)$, consider a channel as in Eq. (10). Then, for $c > c_0$, with probability $1 - o(1)$,

$$S_{\min}(\mathcal{E}) \geq \log|B| - \frac{c}{|B|}. \qquad (14)$$

The main idea in the proof of Lemma II.2 is to look at the probability that the output of a Haar random input state is close to a low entropy state (with entropy smaller than $\log|B| - c/|B|$). On one hand, we will show that, for suitable dimensions (made precise in Proposition II.6), this probability is upper bounded by $\exp(-cK|A|)$ (with $K > 0$ a constant), for a Haar random choice of the channel unitary. On the other hand, we will compute a lower bound on this probability, conditioned on the minimum output entropy of the channel being small; in this way we will get a lower bound of order $\exp(-\ln(2)|A|)$. Putting these two estimates together we obtain Lemma II.2.

There are two key conceptual insights necessary to turn the idea of the previous paragraph into a proof. The first is to define an appropriate notion of closeness, when quantifying how close a state is to a low entropy one. For this, Hastings introduced the concept of a tube around a state [40], which will take a central role in the proof of Lemma II.2.



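Definition II.3 We define the tube around $\sigma \in \mathcal{D}(\mathbb{C}^D)$ with width parameter $N > 0$ as

$$\mathrm{TUBE}(\sigma, N) := \left\{ \pi \in \mathcal{D}(\mathbb{C}^D) : \exists\, \tfrac{1}{2} \leq p \leq 1 \ \text{s.t.} \ \left\| \pi - \Big( p\sigma + (1-p)\frac{I}{D} \Big) \right\|_{\infty} \leq \sqrt{\frac{\log(N)}{N}} \right\}. \qquad (15)$$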
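
A rough numerical membership test for the tube can be written as below. It is a sketch under explicit assumptions: the tube condition is read as $\|\pi - (p\sigma + (1-p)I/D)\|_\infty \leq \sqrt{\log(N)/N}$ for some $1/2 \leq p \leq 1$, the logarithm is taken to be natural, and the search over $p$ is done on a finite grid (so borderline cases may be misclassified). The function name `in_tube` is hypothetical.

```python
import numpy as np

def in_tube(pi, sigma, n_width, grid=1001):
    """Grid search over p in [1/2, 1] for the tube condition (assumed form)."""
    dim = sigma.shape[0]
    width = np.sqrt(np.log(n_width) / n_width)   # natural log: an assumption
    mixed = np.eye(dim) / dim
    for p in np.linspace(0.5, 1.0, grid):
        deviation = pi - (p * sigma + (1 - p) * mixed)
        if np.linalg.norm(deviation, 2) <= width:  # operator (spectral) norm
            return True
    return False

sigma = np.diag([1.0, 0.0])          # a pure, hence low entropy, qubit state
orthogonal = np.diag([0.0, 1.0])
sigma_in_own_tube = in_tube(sigma, sigma, 100)
orthogonal_in_tube = in_tube(orthogonal, sigma, 100)
```

A state always lies in its own tube (take p = 1), whereas the orthogonal pure state is at operator-norm distance at least 3/4 from every point of the segment and is therefore excluded for N = 100.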



We will be interested in the probability that the output of a random input state, over a random choice of the channel, is in the tube of a low entropy state. The set of such states is formalized in the next definition.

Definition II.4 For constants $N, c > 0$, we define the set of states in the tube of a low entropy state as

$$X_{D,N,c} := \left\{ \rho \in \mathcal{D}(\mathbb{C}^D) : \exists\, \sigma \in \mathcal{D}(\mathbb{C}^D) \ \text{with} \ \delta S(\sigma) \geq c/D \ \text{s.t.} \ \rho \in \mathrm{TUBE}(\sigma, N) \right\}. \qquad (16)$$

The second insight is to consider the probability only of a particular subset of the set of states close to a low entropy state. We will look at the intersection of $X_{D,N,c}$ with the set of states of small operator norm. While this restriction will affect only very mildly the lower bound on the probability we are ultimately interested in analyzing, it will allow us to get a much improved upper bound on it.

Definition II.5 For a constant $\alpha > 1$, we define the set of states with bounded operator norm as

$$Y_{D,\alpha} := \left\{ \rho \in \mathcal{D}(\mathbb{C}^D) : \|\rho\|_{\infty} \leq \frac{\alpha}{D} \right\}. \qquad (17)$$

We are now in position to state precisely the two propositions which will be the focus of the remainder of the paper.

Let $|\chi\rangle \in A$ be a Haar random state and $\mathcal{E}$ be a channel given by Eq. (10) with $U$ drawn from the Haar measure in $\mathbb{U}(|A||B|)$. Then, for $|A| \geq |B|^2$, we have

Proposition II.6

Pr (£(x) G X mAlc n Y ]Bla ) < exp (-^ + o(l)|A|) . (18) 

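Moreover, for $\log|A| \geq 8|B|^8$ and $\alpha \geq 15$,

Proposition II.7

$$\Pr\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c} \cap Y_{|B|,\alpha}\big) \geq \exp(-\ln(2)|A|) \left( \Pr\Big(\delta S_{\min} \geq \frac{c}{|B|}\Big) - o(1) \right). \qquad (19)$$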
Combining these two results we get Lemma II.2 by choosing $c > 128\ln(2)\alpha$ and $\alpha = 15$.

We will derive Proposition II.6 from a new large deviation bound, which we believe might be of independent interest. It shows that with high probability the reduced state $\psi^B$ of a random bipartite state $|\psi^{AB}\rangle$ is close, in two norm, to the maximally mixed state. Although similar results are well-known (see e.g. [35, 36]), the restriction to reduced states $\psi^B$ with a small operator norm will allow us to sharpen the exponential bound essentially by a factor of $|B|$; this improvement turns out to be crucial in proving Proposition II.6. By a measure concentration argument we prove in section V the following



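Proposition II.8 For $|\psi^{AB}\rangle \in A \otimes B$ drawn from the Haar measure, $|A| \geq |B|$ and $\alpha \geq 3$,

$$\Pr\left( \left\| \psi^B - \frac{I}{|B|} \right\|_2 \geq \varepsilon \ \text{and} \ \psi^B \in Y_{|B|,\alpha} \right) \leq 4\exp\left( -\frac{|A||B|^2 \big[\varepsilon - 2|A|^{-1/2}\big]^2}{64\alpha} \right). \qquad (20)$$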



III. PROOF OF PROPOSITION II.6

In this section we prove Proposition II.6. The idea is to combine Proposition II.8 and the following simple lemma relating the entropy deviation from its maximal value to the distance to the maximally mixed state.



Lemma III.1 For every $\sigma \in \mathcal{D}(\mathbb{C}^D)$,

$$\left\| \sigma - \frac{I}{D} \right\|_2^2 \geq \frac{\log(D) - S(\sigma)}{D}. \qquad (21)$$



Proof We have

$$S(\rho) \geq -\log(\mathrm{tr}(\rho^2)) = -\log(D\,\mathrm{tr}(\rho^2)) + \log(D) \geq 1 - D\,\mathrm{tr}(\rho^2) + \log(D), \qquad (22)$$

where the first inequality follows from the concavity of the log and the second from the relation $\log(x) \leq x - 1$, valid for $x \geq 1$. Rearranging terms in Eq. (22) and using $\mathrm{tr}(\rho^2) - 1/D = \|\rho - I/D\|_2^2$, we find Eq. (21). □
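
The inequality of Lemma III.1 is easy to probe numerically. The sketch below assumes natural logarithms (so that $\log(x) \leq x - 1$ applies as used in the proof) and samples test states from a Ginibre-type ensemble, which is merely a convenient choice for the example; the helper names are hypothetical.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random full-rank state G G^dagger / tr(G G^dagger), G complex Gaussian."""
    g = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def entropy_nats(rho):
    """Von Neumann entropy with natural logarithm."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-15]
    return float(-np.sum(evals * np.log(evals)))

rng = np.random.default_rng(0)
dim = 6
gaps = []
for _ in range(200):
    sigma = random_density_matrix(dim, rng)
    distance_sq = np.linalg.norm(sigma - np.eye(dim) / dim, 'fro') ** 2
    entropy_deficit = (np.log(dim) - entropy_nats(sigma)) / dim
    gaps.append(distance_sq - entropy_deficit)   # should never be negative
min_gap = min(gaps)
```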



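Proof (Proposition II.6)

Let $|\psi^{AB}\rangle$ be such that $\psi^B \in X_{|B|,|A|,c}$. Then there is a $\sigma$ with $\delta S(\sigma) \geq c/|B|$ such that $\psi^B \in \mathrm{TUBE}(\sigma, |A|)$. From Lemma III.1 we get

$$\left\| \sigma - \frac{I}{|B|} \right\|_2 \geq \frac{\sqrt{c}}{|B|}. \qquad (23)$$

As $\| \psi^B - (p\sigma + (1-p) I/|B|) \|_{\infty} \leq \sqrt{\log|A|/|A|}$ with $p \geq 1/2$, Eq. (23) gives

$$\left\| \psi^B - \frac{I}{|B|} \right\|_2 \geq \frac{\sqrt{c}}{2|B|} - \sqrt{\frac{|B|\log|A|}{|A|}}, \qquad (24)$$

where we used the triangle inequality and $\|X\|_2 \leq \sqrt{|B|}\,\|X\|_{\infty}$.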

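A moment of thought reveals that the distribution of $\mathcal{E}(\chi)$, for random $\mathcal{E}$ and $\chi$, is the same as the distribution of the reduced density matrix $\psi^B$ of a random bipartite state $|\psi\rangle^{AB} \in A \otimes B$. Therefore, from the argument of the previous paragraph,

$$\Pr\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c} \cap Y_{|B|,\alpha}\big) = \Pr\big(\psi^B \in X_{|B|,|A|,c} \cap Y_{|B|,\alpha}\big) \leq \Pr\left( \left\| \psi^B - \frac{I}{|B|} \right\|_2 \geq \frac{\sqrt{c}}{2|B|} - \sqrt{\frac{|B|\log|A|}{|A|}} \ \text{and} \ \psi^B \in Y_{|B|,\alpha} \right). \qquad (25)$$

The result now follows from Proposition II.8. □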



IV. PROOF OF PROPOSITION II.7

In general lines, the idea of the lower bound given by Proposition II.7 is the following. Let $P$ be the probability that a random channel has minimum output entropy bigger than $\log|B| - c/|B|$. For a given channel $\mathcal{E}$, let $\chi_{\mathcal{E}}$ be a pure input state to $\mathcal{E}$ with minimum output entropy, i.e. a state which satisfies $S(\mathcal{E}(\chi_{\mathcal{E}})) = S_{\min}(\mathcal{E})$. We will show that with probability larger than $\Omega(\exp(-\ln(2)|A|))$, $\mathcal{E}(\chi)$ is in the tube of $\mathcal{E}(\chi_{\mathcal{E}})$, for a random choice of the input state $|\chi\rangle$. From this we can conclude that $\mathcal{E}(\chi)$ is in the tube of a low entropy state with probability bigger than $(1 - P)\,\Omega(\exp(-\ln(2)|A|))$.

This is almost all there is to show, except that from the argument of the previous paragraph, we have no guarantee that the states $\mathcal{E}(\chi)$ which we have proven to be in the tube of a low entropy state also belong to $Y_{|B|,\alpha}$. To overcome this difficulty, we employ a large deviation bound due to Harrow, Hayden, and Leung [35] (Lemma C.1 in Appendix C) which shows that with probability bigger than $1 - \exp(-|A|)$, $\mathcal{E}(\chi)$ belongs to $Y_{|B|,\alpha}$. This lemma thus allows us to disregard states not in $Y_{|B|,\alpha}$ for sufficiently large $|A|$.



Proof (Proposition II.7)

The first step in the proof is to eliminate the event $\mathcal{E}(\chi) \notin Y_{|B|,\alpha}$. For this, we first use Lemma A.1 of Appendix A to get

$$\Pr\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c} \cap Y_{|B|,\alpha}\big) \geq \Pr\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c}\big) - \Pr\big(\mathcal{E}(\chi) \notin Y_{|B|,\alpha}\big). \qquad (26)$$

Then, from Lemma C.1 of Appendix C,

S * W <- (™) 2|B| -P ("W^^) * «P (-Ml) , (27) 

for a > 15 and |A| > 2\B\ 111(2151). 

In the remainder of the proof we show that

$$\Pr\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c}\big) \geq \frac{1}{8|A|} \exp(-\ln(2)|A|) \left( \Pr\Big(\delta S_{\min} \geq \frac{c}{|B|}\Big) - o(1) \right). \qquad (28)$$

The result then follows from Eqs. (26), (27), and (28).
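Let us define $\sigma_{\mathcal{E}} := \mathcal{E}(\chi_{\mathcal{E}})$, with $\chi_{\mathcal{E}}$ an input to $\mathcal{E}$ with minimum output entropy, i.e. a state such that $S(\mathcal{E}(\chi_{\mathcal{E}})) = S_{\min}(\mathcal{E})$. From the definition of the set $X_{|B|,|A|,c}$ we find

$$\Pr_{\mathcal{E},\chi}\big(\mathcal{E}(\chi) \in X_{|B|,|A|,c}\big) \geq \Pr_{\mathcal{E},\chi}\left(\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|) \ \text{and} \ \delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\right). \qquad (29)$$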



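We now proceed to bound the right-hand side of Eq. (29). Following [34],

$$\Pr_{\mathcal{E},\chi}\left(\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|) \ \text{and} \ \delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\right) = \mathbb{E}_{\mathcal{E}}\left( \mathbf{1}\Big(\delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\Big) \Pr_{\chi}\big(\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|)\big) \right), \qquad (30)$$

where $\mathbf{1}(\delta S_{\min}(\mathcal{E}) \geq c/|B|)$ is the indicator function of the event (only over channels) $\{\delta S_{\min}(\mathcal{E}) \geq c/|B|\}$.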

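Let us consider the probability over states inside the expectation value in Eq. (30). For a Haar random $|\chi\rangle \in A$, we can write

$$|\chi\rangle = \sqrt{x}\,|\chi_{\mathcal{E}}\rangle + \sqrt{1-x}\,|\phi\rangle, \qquad (31)$$

where $x = |\langle \chi_{\mathcal{E}} | \chi \rangle|^2$ and $|\phi\rangle$ is a state orthogonal to $|\chi_{\mathcal{E}}\rangle$. In Lemma A.2 of Appendix A we prove that $x$ and $|\phi\rangle$ are independent random variables and that $|\phi\rangle$ is distributed according to the Haar measure in the subspace of $A$ orthogonal to $|\chi_{\mathcal{E}}\rangle$. Therefore,

$$\Pr_{\chi}\big(\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|)\big) \geq \Pr_{x}(x \geq 1/2)\, \Pr_{\phi}(F \cap G), \qquad (32)$$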

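where

$$F := \left\{ \big\| \mathcal{E}(|\chi_{\mathcal{E}}\rangle\langle\phi|) \big\|_{\infty} \leq \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \right\}, \qquad G := \left\{ \left\| \mathcal{E}(\phi) - \frac{I}{|B|} \right\|_{\infty} \leq \frac{1}{2}\sqrt{\frac{\log|A|}{|A|}} \right\}. \qquad (33)$$

Indeed, note that if $x \geq 1/2$ and $F$, $G$ hold true,

$$\left\| \mathcal{E}(\chi) - x\,\mathcal{E}(\chi_{\mathcal{E}}) - (1-x)\frac{I}{|B|} \right\|_{\infty} \leq 2\big\| \mathcal{E}(|\chi_{\mathcal{E}}\rangle\langle\phi|) \big\|_{\infty} + \left\| \mathcal{E}(\phi) - \frac{I}{|B|} \right\|_{\infty} \leq \sqrt{\frac{\log|A|}{|A|}}, \qquad (34)$$

which implies $\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|)$.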

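In Lemma IV.1 we use a simple geometric argument to show

$$\Pr\left(|\langle \chi_{\mathcal{E}} | \chi \rangle|^2 \geq \frac{1}{2}\right) \geq \frac{1}{8|A|} \exp(-\ln(2)|A|). \qquad (35)$$

Then, from Eqs. (30) and (32),

$$\Pr_{\mathcal{E},\chi}\left(\mathcal{E}(\chi) \in \mathrm{TUBE}(\sigma_{\mathcal{E}}, |A|) \ \text{and} \ \delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\right) \geq \frac{1}{8|A|} \exp(-\ln(2)|A|)\, \mathbb{E}_{\mathcal{E}}\left( \mathbf{1}\Big(\delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\Big) \Pr_{\phi}(F \cap G) \right)$$
$$= \frac{1}{8|A|} \exp(-\ln(2)|A|)\, \Pr_{\mathcal{E},\phi}\left( F \cap G \cap \Big(\delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\Big) \right). \qquad (36)$$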
From Lemma A.1 we can bound the second term in the last line of the equation above as

$$\Pr\left( F \cap G \cap \Big(\delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\Big) \right) \geq \Pr\left(\delta S_{\min}(\mathcal{E}) \geq \frac{c}{|B|}\right) - \Pr(F^c) - \Pr(G^c). \qquad (37)$$

Eq. (28) now follows from Lemma IV.2, where we prove that $\Pr_{\mathcal{E},\phi}(F^c), \Pr_{\mathcal{E},\phi}(G^c) = o(1)$, asymptotically in $|A|$. □

Lemma IV.1 Let $|\psi\rangle \in A$ be a fixed state and $|\chi\rangle \in A$ be drawn from the Haar measure. Then,

$$\Pr\left(|\langle \psi | \chi \rangle|^2 \geq \frac{1}{2}\right) \geq \frac{1}{8|A|} \exp(-|A| \ln 2). \qquad (38)$$

Proof

The vectors $|\chi\rangle$ can be seen as points $(x_1, \ldots, x_n)$ on the real unit sphere $\mathbb{S}^{n-1}$ with $n = 2|A|$. The Haar measure is thus the normalized area of the sphere and the condition $|\langle \psi | \chi \rangle|^2 \geq 1/2$ reads as $x_1^2 + x_2^2 \geq 1/2$.

Clearly $\Pr(x_1^2 + x_2^2 \geq 1/2)$ is lower bounded by $\Pr(x_1^2 \geq 1/2)$, which equals the ratio of the area of a polar cap determined by the condition $x_1^2 \geq 1/2$ and the area of the sphere. The area of the cap is in turn lower bounded by the volume of an $(n-1)$-dimensional ball given by the condition $x_2^2 + \ldots + x_n^2 \leq 1/2$ (the projection of the cap onto a subspace perpendicular to the $x_1$ axis). Invoking explicit formulas for the volume of a ball and the area of a sphere, we obtain

$$\Pr\left(|\langle \psi | \chi \rangle|^2 \geq 1/2\right) \geq \Pr\left(x_1^2 \geq 1/2\right) \geq \frac{1}{8|A|}\, e^{-\ln(2)|A|}. \qquad (39)$$

□
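
The exponential scaling in Lemma IV.1 can be cross-checked by simulation. The sketch below relies on a standard fact that is not taken from the paper: for a Haar random unit vector in $\mathbb{C}^d$, the squared overlap with a fixed state is Beta(1, d-1) distributed, so $\Pr(|\langle\psi|\chi\rangle|^2 \geq t) = (1-t)^{d-1}$. For $t = 1/2$ this equals $2\exp(-\ln(2)|A|)$, in line with the $\exp(-\ln(2)|A|)$ behaviour used in the proof. The small dimension and the helper name `haar_states` are choices made for the example.

```python
import numpy as np

def haar_states(dim, count, rng):
    """Rows are Haar-random unit vectors in C^dim (normalized Gaussians)."""
    z = rng.standard_normal((count, dim)) + 1j * rng.standard_normal((count, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

rng = np.random.default_rng(1)
dim_a = 4
chi = haar_states(dim_a, 200_000, rng)
overlap_sq = np.abs(chi[:, 0]) ** 2       # fixed |psi> = first basis vector
estimate = float(np.mean(overlap_sq >= 0.5))
exact = 0.5 ** (dim_a - 1)                # (1 - t)^(d - 1) at t = 1/2
scale = float(np.exp(-np.log(2) * dim_a)) # exp(-ln(2)|A|)
```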

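Lemma IV.2 Let $|\psi\rangle$ be a fixed state in $A$, $|\phi\rangle$ be drawn from the Haar measure in the subspace of $A$ orthogonal to $|\psi\rangle$ and $\mathcal{E}$ be a channel as in Eq. (10), with $U$ drawn from the Haar measure in $\mathbb{U}(|A||B|)$. Define

$$F := \left\{ \big\| \mathcal{E}(|\psi\rangle\langle\phi|) \big\|_{\infty} \leq \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \right\}, \qquad G := \left\{ \left\| \mathcal{E}(\phi) - \frac{I}{|B|} \right\|_{\infty} \leq \frac{1}{2}\sqrt{\frac{\log|A|}{|A|}} \right\}. \qquad (40)$$

Then, for $\log|A| \geq 8|B|^8$ there are constants $C_1, C_2 > 0$ such that

$$\Pr(F) \geq 1 - \exp\left(-\frac{C_1 \log|A|}{|B|^8}\right), \qquad \Pr(G) \geq 1 - \exp(-C_2 \log|A|). \qquad (41)$$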

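Proof Let us start with the bound on the probability of $F^c$. Consider the complementary channel of $\mathcal{E}$, defined by $\mathcal{E}^c(\rho) := \mathrm{tr}_B\big(U (\rho^A \otimes |0\rangle\langle 0|^B) U^{\dagger}\big)$. Noting that $\mathcal{E}^c$ is a channel with input and output dimension $|A|$ and environment dimension $|B|$, we can write

$$\mathcal{E}^c(\rho) = \sum_{k=1}^{|B|^2} A_k \rho A_k^{\dagger} \qquad (42)$$

for Kraus operators $A_k$ such that $\sum_k A_k^{\dagger} A_k = I$. Thus

$$\mathcal{E}(\rho) = \sum_{k=1}^{|B|^2} \sum_{k'=1}^{|B|^2} \mathrm{tr}\big(A_{k'}^{\dagger} A_k \rho\big)\, |k\rangle\langle k'|, \qquad (43)$$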
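from which we find

$$\big\| \mathcal{E}(|\psi\rangle\langle\phi|) \big\|_{\infty} \leq |B|^4 \max_{k,k'} \big|\langle \phi | A_{k'}^{\dagger} A_k | \psi \rangle\big|. \qquad (44)$$

Let $k_{\max}, k'_{\max}$ be the optimal indices in the equation above and define $|\theta\rangle := A_{k'_{\max}}^{\dagger} A_{k_{\max}} |\psi\rangle / \big\| A_{k'_{\max}}^{\dagger} A_{k_{\max}} |\psi\rangle \big\|$. As $\|A_k\|_{\infty} \leq 1$ for all $k$, we get $\big\| A_{k'_{\max}}^{\dagger} A_{k_{\max}} |\psi\rangle \big\| \leq 1$ and hence

$$\big\| \mathcal{E}(|\psi\rangle\langle\phi|) \big\|_{\infty} \leq |B|^4\, |\langle \theta | \phi \rangle|. \qquad (45)$$

We thus have

$$\Pr(F^c) \leq \Pr\left( |\langle \theta | \phi \rangle| \geq \frac{1}{4|B|^4} \sqrt{\frac{\log|A|}{|A|}} \right). \qquad (46)$$

Applying Lemma IV.3 to the equation above we find

$$\Pr\left( |\langle \theta | \phi \rangle| \geq \frac{1}{4|B|^4} \sqrt{\frac{\log|A|}{|A|}} \right) \leq 2\exp\left( -\frac{K \log|A|}{|B|^8} \right) \qquad (47)$$

for $\log|A| \geq 8|B|^8$ and a constant $K > 0$. This gives the bound on $\Pr(F)$ given in Eq. (41).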

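Let us now turn to the bound on the probability of $G$. From Lemma A.2 of Appendix A we can select $|\phi\rangle$ by drawing $|\chi\rangle \in A$ from the Haar measure and setting $|\chi\rangle = \sqrt{x}\,|\psi\rangle + \sqrt{1-x}\,|\phi\rangle$. Then we have

$$\left\| \mathcal{E}(\phi) - \frac{I}{|B|} \right\|_{\infty} \leq \big\| \mathcal{E}(\phi) - \mathcal{E}(\chi) \big\|_{\infty} + \left\| \mathcal{E}(\chi) - \frac{I}{|B|} \right\|_{\infty} \leq \| \phi - \chi \|_1 + \left\| \mathcal{E}(\chi) - \frac{I}{|B|} \right\|_{\infty}, \qquad (48)$$

where the first inequality follows from the triangle inequality and the second from the fact that $\|X\|_{\infty} \leq \|X\|_1$ and the monotonicity of the trace norm under trace preserving CP maps. Therefore,

$$\Pr_{\mathcal{E},\phi}(G) \geq \Pr_{\mathcal{E},\chi}\left( \| \phi - \chi \|_1 \leq \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \ \text{and} \ \left\| \mathcal{E}(\chi) - \frac{I}{|B|} \right\|_{\infty} \leq \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \right). \qquad (49)$$

From Lemma A.1 of Appendix A, in turn,

$$\Pr_{\mathcal{E},\phi}(G) \geq 1 - \Pr_{\mathcal{E},\chi}\left( \| \phi - \chi \|_1 > \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \right) - \Pr_{\mathcal{E},\chi}\left( \left\| \mathcal{E}(\chi) - \frac{I}{|B|} \right\|_{\infty} > \frac{1}{4}\sqrt{\frac{\log|A|}{|A|}} \right). \qquad (50)$$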



One one hand, we have [|0 — xlli < V^-WMxW = \Jlx(2 - x) < = 2|(^|x)|. Follow- 
ing 134TJ , we find that if we replace |0) by \x), then with high probability it will only incur in a 
small error. Indeed, from Lemma ttV.31 



Pr ( ||^-xlli > W^rjp) ^ 2exp(-Kl g|^|) 



(51) 
for a constant $K > 0$.

On the other hand, from Lemma C.1 of Appendix C,



oo 



1 /logL4|\ / log |A| 

-4V^4T " eXP r560M2)J- 



Combining these last two equations with Eq. (50), we find the lower bound on $\Pr(G)$ given in Eq. (41). □

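Lemma IV.3 Let $S \subseteq A$ be a $|S|$-dimensional subspace of $A$ and let $P_S$ be the projector onto $S$. For $|\phi\rangle \in S$ drawn from the Haar measure in $S$ and a fixed $|\theta\rangle \in A$,

$$\Pr\left( |\langle \theta | \phi \rangle| \geq \frac{1}{\sqrt{|S|}} + \varepsilon \right) \leq 4\exp\left( -\frac{|S|\,\varepsilon^2}{16} \right). \qquad (53)$$

Proof We prove the lemma by applying Levy's lemma, given in Lemma V.1 of section V, with $f(|\phi\rangle) := |\langle \theta | \phi \rangle|$. On one hand, we have

$$\mathbb{E}\big(f(|\phi\rangle)^2\big) = \frac{\langle \theta | P_S | \theta \rangle}{|S|} \leq \frac{1}{|S|}. \qquad (54)$$

Then, from the convexity of $x^2$, $\mathbb{E}(f(|\phi\rangle))^2 \leq \mathbb{E}(f(|\phi\rangle)^2) \leq |S|^{-1}$. On the other hand, the Lipschitz constant of $f$ is easily seen to be unity. The result then follows from Lemma V.1. □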

Remark: We note that in the proof of Proposition II.7 we set the input dimension $|A|$ to be exponentially larger than the output dimension $|B|$; this is due to the factor of $\log|A|/|A|$ in the definition of the tube. We could have instead defined the width of the tube as $f(|A|)/|A|$ for any function $f$ sublinear in $|A|$. In this way we can get a much better dependence of the input dimension $|A|$ on the output dimension $|B|$. Besides that, as in Hastings' original proof, we have used equal input and environment dimensions. However, our approach allows us to consider the general case in essentially the same fashion. In principle, this could lead to a better scaling of the minimal dimensions for which counterexamples can be shown to exist.



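V. PROOF OF PROPOSITION II.8

Let $\mathbb{S}^n := \{x \in \mathbb{R}^{n+1} : \|x\|_2 = 1\}$ denote the Euclidean sphere in $\mathbb{R}^{n+1}$ and $\mu$ denote the normalized rotationally invariant measure in $\mathbb{S}^n$ (the Haar measure). Our strategy to prove Proposition II.8 is to exploit the measure concentration phenomenon in high dimensional spheres [37, 38]. For a subset $A \subseteq \mathbb{S}^n$, define the $\varepsilon$-neighborhood of $A$ as

$$A_{\varepsilon} := \{ y \in \mathbb{S}^n : \exists\, x \in A \ \text{s.t.} \ \|x - y\|_2 \leq \varepsilon \}. \qquad (55)$$

Theorem II (Concentration of Measure in $\mathbb{S}^n$ [37, 38]) Let $A \subseteq \mathbb{S}^n$ and $0 < \varepsilon < 1$. If $\mu(A) \geq 1/2$, then

$$\mu(A_{\varepsilon}) \geq 1 - 4\exp\left( -\frac{(n+1)\,\varepsilon^2}{16} \right).$$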
This theorem says that the area of $\mathbb{S}^n$ is sharply concentrated around any set with measure bigger than 1/2. A simple but very powerful corollary of Theorem II says that slowly varying functions on $\mathbb{S}^n$ attain a value very close to their average almost everywhere (see e.g. [35] for applications to quantum information theory). This is the content of Levy's Lemma.

Lemma V.1 (Levy's Lemma [37, 38]) Let $f : \mathbb{S}^n \to \mathbb{R}$ be a function with Lipschitz constant $\eta$ and a point $x \in \mathbb{S}^n$ be chosen uniformly at random. Then

$$\Pr\big(|f(x) - \mathbb{E} f| \geq \alpha\big) \leq 4\exp\left( -\frac{(n+1)\,\alpha^2}{16\eta^2} \right). \qquad (56)$$

Given a Haar distributed state $|\psi\rangle \in A$, we can see it as a Haar distributed point in $\mathbb{S}^{2|A|-1}$. Therefore the lemma above applies to Haar pure states as well.
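
To get a feeling for Levy's Lemma, one can compare the bound of Eq. (56) with a simulation. The sketch below uses the coordinate function $f(x) = x_1$ on the sphere, which has Lipschitz constant 1 and mean 0 by symmetry; the dimension, the deviation `alpha` and the sample size are arbitrary choices made for this example, and points are sampled by normalizing Gaussian vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 400                       # n + 1, the dimension of the ambient space
alpha = 0.4                     # deviation from the mean
samples = 5000

# Uniform points on the sphere S^n: normalized standard Gaussian vectors.
x = rng.standard_normal((samples, dim))
x /= np.linalg.norm(x, axis=1, keepdims=True)

f_values = x[:, 0]              # f(x) = x_1, 1-Lipschitz with mean 0
empirical_tail = float(np.mean(np.abs(f_values) >= alpha))
levy_bound = float(4 * np.exp(-dim * alpha ** 2 / 16))   # Eq. (56), eta = 1
```

The empirical tail is far below the bound here, which is expected: Eq. (56) holds for every Lipschitz function and is not tight for this particular one.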

The proof of Proposition II.8 will follow closely the standard argument for deriving Levy's Lemma (see e.g. [37, 38]). An important difference is that we are only interested in establishing a large deviation bound for a particular subset of the state space, namely for states $|\psi^{AB}\rangle$ whose reduced state $\psi^B$ has operator norm bounded by $\alpha/|B|$. Such a restriction will allow us to use an improved bound on the Lipschitz constant of the function $g(|\psi^{AB}\rangle) := \| \psi^B - I/|B| \|_2$ and sharpen the exponential bound appearing in Levy's Lemma by a factor of $|B|/(4\alpha)$.



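Proof (Proposition II.8) Define

$$g(|\psi^{AB}\rangle) := \left\| \psi^B - \frac{I}{|B|} \right\|_2. \qquad (57)$$

Note that $g$ is a function from $\mathbb{S}^{2|A||B|-1}$ to $\mathbb{R}$. Let $m(g)$ be the median of $g$ and set $M := \{ |\psi^{AB}\rangle : g(|\psi^{AB}\rangle) \leq m(g) \}$. In Lemma V.3 we show $m(g) \leq 2|A|^{-1/2}$. Thus for every $|\psi^{AB}\rangle \in M$, we have

$$\| \psi^B \|_2^2 \leq \frac{1}{|B|} + m(g)^2 \leq \frac{1}{|B|} + \frac{4}{|A|}. \qquad (58)$$

An application of Lemma B.1 of Appendix B with $\lambda = \alpha/|B|$ then gives the following bound on the operator norm of states in $M$,

$$\| \psi^B \|_{\infty} \leq \frac{\alpha}{|B|} \qquad (59)$$

for every $\psi^{AB} \in M$ and $|A| \geq |B|^2$ and $\alpha \geq 3$.

Consider a state $|\psi^{AB}\rangle$ such that

$$g(|\psi^{AB}\rangle) \geq m(g) + \beta \quad \text{and} \quad \| \psi^B \|_{\infty} \leq \alpha/|B|. \qquad (60)$$

Because of the bound on the operator norm of $\psi^B$, we can use Lemma V.2 to find from the first inequality of Eq. (60) that $\psi^{AB}$ must be at least $\beta\sqrt{|B|/(4\alpha)}$ away from $M$. Furthermore, by definition of the median, $\mu(M) \geq 1/2$. Therefore from Theorem II,

$$\Pr\left( \left\| \psi^B - \frac{I}{|B|} \right\|_2 \geq \varepsilon \ \text{and} \ \psi^B \in Y_{|B|,\alpha} \right) \leq 1 - \mu\Big( M_{(\varepsilon - m(g))\sqrt{|B|/(4\alpha)}} \Big) \leq 4\exp\left( -\frac{|A||B|^2 (\varepsilon - m(g))^2}{64\alpha} \right), \qquad (61)$$

and we are done. □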

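The next lemma shows that for states with operator norm bounded by $\alpha/|B|$, the Lipschitz constant of the function $g$ is improved by a factor of $\sqrt{|B|/(4\alpha)}$.

Lemma V.2 Let $|\psi^{AB}\rangle, |\phi^{AB}\rangle \in A \otimes B$ be such that $\| \psi^B \|_{\infty}, \| \phi^B \|_{\infty} \leq \alpha/|B|$. Then

$$\left\| \psi^B - \frac{I}{|B|} \right\|_2 - \left\| \phi^B - \frac{I}{|B|} \right\|_2 \leq \sqrt{\frac{4\alpha}{|B|}}\, \big\| |\psi^{AB}\rangle - |\phi^{AB}\rangle \big\|_2. \qquad (62)$$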
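Proof We assume without loss of generality that $\| \psi^B - I/|B| \|_2 \geq \| \phi^B - I/|B| \|_2$. Let $\{ |i\rangle \}_{i=1}^{|B|}$ be an eigenbasis for $\psi^B$ and define $M(\rho) := \sum_i \langle i | \rho | i \rangle\, |i\rangle\langle i|$. Then,

$$\left\| \psi^B - \frac{I}{|B|} \right\|_2 - \left\| \phi^B - \frac{I}{|B|} \right\|_2 \leq \left\| M(\psi^B) - \frac{I}{|B|} \right\|_2 - \left\| M(\phi^B) - \frac{I}{|B|} \right\|_2 \leq \big\| M(\psi^B) - M(\phi^B) \big\|_2, \qquad (63)$$

where the first inequality follows from Lemma V.4 and the second inequality from the triangle inequality.

Let $\{p_k\}_k$ and $\{q_k\}_k$ be the eigenvalues of $M(\psi^B) = \psi^B$ and $M(\phi^B)$, respectively. Since $\| \psi^B \|_{\infty}, \| \phi^B \|_{\infty} \leq \alpha/|B|$, we find from Lemma V.4 that $(\max_k p_k), (\max_k q_k) \leq \alpha/|B|$. Hence

$$\big\| M(\psi^B) - M(\phi^B) \big\|_2^2 = \sum_k (p_k - q_k)^2 = \sum_k \big(\sqrt{p_k} + \sqrt{q_k}\big)^2 \big(\sqrt{p_k} - \sqrt{q_k}\big)^2 \leq \frac{4\alpha}{|B|} \sum_k \big(\sqrt{p_k} - \sqrt{q_k}\big)^2$$
$$= \frac{4\alpha}{|B|} \big( 2 - 2F(M(\psi^B), M(\phi^B)) \big) \leq \frac{4\alpha}{|B|} \big( 2 - 2F(\psi^B, \phi^B) \big) \leq \frac{4\alpha}{|B|} \big( 2 - 2F(\psi^{AB}, \phi^{AB}) \big) \leq \frac{4\alpha}{|B|} \big\| |\psi^{AB}\rangle - |\phi^{AB}\rangle \big\|_2^2, \qquad (64)$$

where the second and third inequalities follow from the monotonicity of the fidelity under trace preserving CP maps. Putting Eqs. (63) and (64) together gives the result. □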

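The next lemma gives an upper bound on the median of the function $g$.

Lemma V.3 Let $g : \mathbb{S}^{2|A||B|-1} \to \mathbb{R}$ be such that

$$g(|\psi^{AB}\rangle) := \left\| \psi^B - \frac{I}{|B|} \right\|_2 \qquad (65)$$

and $m(g)$ be the median of $g$. Then $m(g) \leq 2|A|^{-1/2}$.

Proof We start by bounding the median by the expectation value of $g$ as follows:

$$\mathbb{E} g = \int_{g < m(g)} g(\psi)\, \mu(d\psi) + \int_{g \geq m(g)} g(\psi)\, \mu(d\psi) \geq m(g) \int_{g \geq m(g)} \mu(d\psi) \geq \frac{m(g)}{2}. \qquad (66)$$

We proceed by upper bounding the expectation value of $g(|\psi\rangle)$,

$$(\mathbb{E} g)^2 \leq \mathbb{E}(g^2) = \mathbb{E}\big(\mathrm{tr}((\psi^B)^2)\big) - \frac{1}{|B|} = \mathrm{tr}\Big( \mathbb{E}\big(\psi^{AB} \otimes \psi^{A'B'}\big)\, I_{AA'} \otimes \mathbb{F}_{BB'} \Big) - \frac{1}{|B|}, \qquad (67)$$

where $\mathbb{F}_{BB'}$ is the swap operator on the two systems $BB'$. The first inequality of the equation above follows from the convexity of $x^2$. From Schur's Lemma,

$$\mathbb{E}\big(\psi^{AB} \otimes \psi^{A'B'}\big) = \frac{I_{AA'BB'} + \mathbb{F}_{AA'} \otimes \mathbb{F}_{BB'}}{|A||B|\,(|A||B| + 1)}. \qquad (68)$$

Putting Eqs. (67) and (68) together gives $\mathbb{E}(g^2) \leq 1/|A|$, and hence by Eq. (66) $m(g) \leq 2|A|^{-1/2}$. □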
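
The bound on the median can be checked by sampling. In the sketch below, Haar random bipartite states are generated as normalized complex Gaussian arrays (a standard recipe, assumed here rather than taken from the paper), and the value used for $\mathbb{E}(g^2)$ is the one obtained by evaluating the trace in Eqs. (67) and (68), namely $(|A| + |B|)/(|A||B| + 1) - 1/|B|$. The dimensions are arbitrary small choices.

```python
import numpy as np

rng = np.random.default_rng(3)
dim_a, dim_b, samples = 16, 4, 2000

g_values = np.empty(samples)
for k in range(samples):
    # Haar random state on A ⊗ B, stored as a |A| x |B| coefficient matrix.
    psi = rng.standard_normal((dim_a, dim_b)) + 1j * rng.standard_normal((dim_a, dim_b))
    psi /= np.linalg.norm(psi)
    rho_b = psi.T @ psi.conj()            # reduced state on B (trace over A)
    g_values[k] = np.linalg.norm(rho_b - np.eye(dim_b) / dim_b, 'fro')

median_g = float(np.median(g_values))
mean_g_sq = float(np.mean(g_values ** 2))
exact_mean_g_sq = (dim_a + dim_b) / (dim_a * dim_b + 1) - 1 / dim_b
median_bound = 2 / np.sqrt(dim_a)         # Lemma V.3
```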

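The final lemma of this section shows the monotonicity of the operator and two norms under pinching.

Lemma V.4 For every $X$,

$$\| X \|_{\infty} \geq \Big\| \sum_k P_k X P_k \Big\|_{\infty} \quad \text{and} \quad \| X \|_2 \geq \Big\| \sum_k P_k X P_k \Big\|_2 \qquad (69)$$

for orthogonal projectors $P_k$ with $\sum_k P_k = I$.

Proof Direct calculation. □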
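
A quick numerical illustration of Lemma V.4, under the assumption that the projectors $P_k$ are taken onto blocks of coordinate vectors (so that pinching amounts to zeroing the off-block entries); the block sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 6
x = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))

# Coordinate blocks defining orthogonal projectors P_k that sum to the identity.
blocks = [range(0, 2), range(2, 5), range(5, 6)]
pinched = np.zeros_like(x)
for block in blocks:
    idx = np.array(block)
    pinched[np.ix_(idx, idx)] = x[np.ix_(idx, idx)]   # adds P_k X P_k

operator_norm_ok = bool(np.linalg.norm(pinched, 2) <= np.linalg.norm(x, 2) + 1e-12)
two_norm_ok = bool(np.linalg.norm(pinched, 'fro') <= np.linalg.norm(x, 'fro') + 1e-12)
```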



VI. ACKNOWLEDGEMENT 

We thank Robert Alicki for sharing his analysis of Hastings' paper (which triggered us to undertake a similar study) and Graeme Smith for providing the proof of Lemma A.2. This work was supported by EC IP SCALA and an EPSRC Postdoctoral Fellowship for Theoretical Physics. FB would like to thank the members of the National Quantum Information Centre of Gdansk, where this work was done, for their hospitality.



APPENDIX A: A FEW PROBABILITY FACTS 



Lemma A.1 For two events $M, N$, $\Pr(M \cap N) \geq \Pr(M) - \Pr(N^c)$, where $N^c$ is the complement of $N$.

Proof We have

$$\Pr(M) = \Pr(M \cap N) + \Pr(M \cap N^c) \leq \Pr(M \cap N) + \Pr(N^c). \qquad (A1)$$

Rearranging terms in the equation above gives the result of the lemma. □

Lemma A.2 Let $|\chi\rangle \in A$ be drawn from the Haar measure. Write

$$|\chi\rangle = \sqrt{x}\,|\psi\rangle + \sqrt{1-x}\,|\phi\rangle, \qquad (A2)$$

where $|\psi\rangle \in A$ is a fixed state, $x = |\langle \psi | \chi \rangle|^2$, and $|\phi\rangle$ is a state orthogonal to $|\psi\rangle$. Then $x$ and $|\phi\rangle$ are independent random variables and $|\phi\rangle$ is distributed according to the Haar measure in the subspace of $A$ orthogonal to $|\psi\rangle$.

Proof Let $p_A(|\chi\rangle)$ be the probability density function associated with the Haar measure in $A$. We can write $p_A(|\chi\rangle) = p_A(x, |\phi\rangle)$. From the invariance of the Haar measure under unitary transformations, $p_A(x, |\phi\rangle) = p_A(x, U|\phi\rangle)$, for every $x$ and every unitary $U$ which acts non-trivially only in the subspace of $A$ orthogonal to $|\psi\rangle$, $A_{\psi}^{\perp}$. Therefore, the conditional probability density function

$$p_A(|\phi\rangle \,|\, x) = \frac{p_A(|\phi\rangle, x)}{p_A(x)} \qquad (A3)$$

is such that $p_A(U|\phi\rangle \,|\, x) = p_A(|\phi\rangle \,|\, x)$ for every $x$ and unitary $U$ acting on $A_{\psi}^{\perp}$. From the uniqueness of the Haar measure, we find that for every $x$, $p_A(|\phi\rangle \,|\, x) = p_{A_{\psi}^{\perp}}(|\phi\rangle)$. This shows both that $|\phi\rangle$ is independent of $x$ and that it is Haar distributed. □
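APPENDIX B: RELATING OPERATOR NORM, TWO NORM, AND ENTROPY

Lemma B.1 Let $\rho \in \mathcal{D}(\mathbb{C}^D)$ be such that $\| \rho \|_{\infty} \geq \lambda \geq 1/D$. Then

$$S(\rho) \leq s(\lambda, D) := (1 - \lambda) \log(D - 1) + h(\lambda), \qquad (B1)$$

and

$$\| \rho \|_2^2 \geq \lambda^2 + \frac{(1 - \lambda)^2}{D - 1}, \qquad (B2)$$

where $h(x) := -x \log x - (1 - x) \log(1 - x)$ is the Shannon binary entropy.

Proof Let $\lambda_i$ be the eigenvalues of $\rho$ in decreasing order. Then, for every $N \in \{1, \ldots, D\}$,

$$\sum_{i=1}^{N} \lambda_i \geq \lambda_1 + (N - 1)\frac{1 - \lambda_1}{D - 1}, \qquad (B3)$$

which shows that $\{\lambda_i\}$ majorizes the probability distribution $q := \left\{ \lambda_1, \frac{1 - \lambda_1}{D - 1}, \ldots, \frac{1 - \lambda_1}{D - 1} \right\}$. From the Schur convexity of $x \log x$,

$$S(\rho) \leq S(q) = s(\lambda_1, D) := (1 - \lambda_1) \log(D - 1) + h(\lambda_1). \qquad (B4)$$

A simple calculation shows that $\partial s(\lambda, D)/\partial \lambda \leq 0$ for all $\lambda \geq 1/D$. Therefore, the function $s(\lambda, D)$ is monotonically decreasing in $\lambda$ for $\lambda \geq 1/D$. As $\lambda_1 = \| \rho \|_{\infty} \geq \lambda$, we find that $S(\rho) \leq s(\lambda, D)$.

The bound on the two norm can be obtained in an analogous way. As $x^2$ is Schur convex, we get that

$$\| \rho \|_2^2 \geq \| q \|_2^2 = r(\lambda_1, D) := \lambda_1^2 + \frac{(1 - \lambda_1)^2}{D - 1}. \qquad (B5)$$

A simple calculation shows that $r(\lambda_1, D)$ is monotonically increasing in $\lambda_1$, so that $r(\lambda_1, D) \geq r(\lambda, D)$.

□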
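
Both bounds of Lemma B.1 can be probed numerically with $\lambda$ set to the largest eigenvalue $\lambda_1$ of a random state, as in Eqs. (B4) and (B5). The sketch assumes natural logarithms (the inequalities are insensitive to the base as long as it is used consistently) and a Ginibre-type ensemble of test states; all function names are hypothetical.

```python
import numpy as np

def binary_entropy(x):
    """Shannon binary entropy h(x) in nats, for 0 < x < 1."""
    return float(-x * np.log(x) - (1 - x) * np.log(1 - x))

def s_bound(lam, dim):
    """s(lambda, D) = (1 - lambda) log(D - 1) + h(lambda)."""
    return (1 - lam) * np.log(dim - 1) + binary_entropy(lam)

def r_bound(lam, dim):
    """r(lambda, D) = lambda^2 + (1 - lambda)^2 / (D - 1)."""
    return lam ** 2 + (1 - lam) ** 2 / (dim - 1)

rng = np.random.default_rng(6)
dim = 5
entropy_ok, purity_ok = True, True
for _ in range(200):
    g = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    rho = g @ g.conj().T
    rho /= np.trace(rho).real
    evals = np.linalg.eigvalsh(rho)            # ascending order
    lam1 = float(evals[-1])                    # operator norm of rho
    positive = evals[evals > 1e-15]
    entropy = float(-np.sum(positive * np.log(positive)))
    purity = float(np.sum(evals ** 2))         # squared two norm
    entropy_ok = entropy_ok and entropy <= s_bound(lam1, dim) + 1e-10
    purity_ok = purity_ok and purity >= r_bound(lam1, dim) - 1e-10
```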



APPENDIX C: LARGE DEVIATION BOUND FOR THE OPERATOR NORM 



The following lemma, due to Harrow, Hayden, and Leung [35], is used twice in the proof of Proposition II.7.

Lemma C.1 (Lemma III.4 of [36]) Let $|\psi^{AB}\rangle \in A \otimes B$ be drawn from the Haar measure. For every $0 < \varepsilon \leq 1$,



Pr(||V B |L >— + —)< 



while for every e > 43 
Pr I \\ib B „ 

~ \ B \ \ B \J ~ V £ J V 141n(2) 



Pr , > + ±.\ < ( m\ m exp r-i V £ -'°f'Lt £)> ) , <C2) 
APPENDIX D: PROOF OF LEMMA II.1

Following Refs. [28, 30], we use the canonical maximally entangled state $|\Phi^{AA'}\rangle = |A|^{-1/2} \sum_{i=1}^{|A|} |i\rangle^{A} |i\rangle^{A'}$ as an input state to

$$\mathcal{E} \otimes \bar{\mathcal{E}}(\rho) = \mathrm{tr}_{AA'}\Big( (U \otimes U^{*}) \big( \rho^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} \big) (U \otimes U^{*})^{\dagger} \Big), \qquad (D1)$$

where $U$ acts on $AB$ and $U^{*}$ on $A'B'$.

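We can get a lower bound on the operator norm of $\mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'})$ as follows:

$$\big\| \mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'}) \big\|_{\infty} \geq \mathrm{tr}\big( \Phi^{BB'}\, \mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'}) \big)$$
$$= \mathrm{tr}\Big( \Phi^{BB'}\, \mathrm{tr}_{AA'}\big( (U \otimes U^{*}) ( \Phi^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} ) (U \otimes U^{*})^{\dagger} \big) \Big)$$
$$= \mathrm{tr}\Big( I^{AA'} \otimes \Phi^{BB'}\, (U \otimes U^{*}) ( \Phi^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} ) (U \otimes U^{*})^{\dagger} \Big)$$
$$\overset{(i)}{\geq} \mathrm{tr}\Big( \Phi^{AA'} \otimes \Phi^{BB'}\, (U \otimes U^{*}) ( \Phi^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} ) (U \otimes U^{*})^{\dagger} \Big)$$
$$= \mathrm{tr}\Big( (U \otimes U^{*})^{\dagger}\, \Phi^{AA'} \otimes \Phi^{BB'}\, (U \otimes U^{*}) ( \Phi^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} ) \Big)$$
$$\overset{(ii)}{=} \mathrm{tr}\Big( \Phi^{AA'} \otimes \Phi^{BB'}\, ( \Phi^{AA'} \otimes |0\rangle\langle 0|^{B} \otimes |0\rangle\langle 0|^{B'} ) \Big) = \frac{1}{|B|}. \qquad (D2)$$

In (i) we used $\Phi^{AA'} \leq I$, while (ii) follows from the identity $(X^{C} \otimes I^{C'}) |\Phi^{CC'}\rangle = (I^{C} \otimes (X^{T})^{C'}) |\Phi^{CC'}\rangle$.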

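Applying Lemma B.1 to $\mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'})$, with $D = |B|^2$ and $\lambda = |B|^{-1}$, then gives

$$S\big( \mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'}) \big) \leq s(|B|^{-1}, |B|^2) = 2\log|B| - \frac{\log|B|}{|B|}. \qquad (D3)$$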
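
The key step of this appendix, the lower bound $1/|B|$ on the overlap of $\mathcal{E} \otimes \bar{\mathcal{E}}(\Phi^{AA'})$ with $\Phi^{BB'}$, can be verified numerically for small dimensions. The sketch below makes several assumptions for illustration only: the Haar unitary is drawn with the QR-based recipe, the composite index is ordered as (a, b) -> a*|B| + b, and the dimensions are tiny. It checks only the overlap and operator-norm bound of Eq. (D2), not the entropy estimate of Eq. (D3).

```python
import numpy as np

def haar_unitary(dim, rng):
    """Sample a Haar-distributed unitary via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((dim, dim))
         + 1j * rng.standard_normal((dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

rng = np.random.default_rng(5)
dim_a, dim_b = 3, 2
u = haar_unitary(dim_a * dim_b, rng)
# Isometry V|i> = U(|i> ⊗ |0>_B), with indices (a_out, b_out, i).
v = u.reshape(dim_a, dim_b, dim_a, dim_b)[:, :, :, 0]

# (V ⊗ V*) applied to the maximally entangled state; indices (a, b, a', b').
psi = np.einsum('abi,cdi->abcd', v, v.conj()) / np.sqrt(dim_a)
# Partial trace over A and A' gives the output state on B ⊗ B'.
rho_out = np.einsum('abcd,aecf->bdef', psi, psi.conj())
rho_out = rho_out.reshape(dim_b ** 2, dim_b ** 2)

phi_b = np.eye(dim_b).reshape(dim_b ** 2) / np.sqrt(dim_b)   # |Phi^{BB'}>
overlap = float(np.real(phi_b.conj() @ rho_out @ phi_b))
operator_norm = float(np.linalg.eigvalsh(rho_out)[-1])
output_trace = float(np.real(np.trace(rho_out)))
```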